[Master to feature] Merge master to feature branch #6961
minglumlu merged 129 commits into feature/trusted-certs on Mar 20, 2026
Signed-off-by: David Scott <dave.scott@eu.citrix.com>
Signed-off-by: David Scott <dave.scott@eu.citrix.com>
Signed-off-by: David Scott <dave.scott@eu.citrix.com>
Add initial version
We never 'throw' exceptions between threads, so we can use thread-local backtrace tables to avoid too much contention and having to resize the global table. Users must wrap all their threads in a `Backtrace.with_backtraces (fun () -> ... )`. Failure to do so will generate a backtrace with an error in it, so it should be obvious that this wasn't done. Signed-off-by: David Scott <dave.scott@eu.citrix.com>
We use Hashtbl.add and then Hashtbl.remove, but always adding a reference to the same record. We use the size of the bindings list in the hashtable as the reference count. Signed-off-by: David Scott <dave.scott@eu.citrix.com>
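A minimal sketch of the reference-counting trick described above, using only the standard library and assuming nothing about the backtrace library's internals. `Hashtbl.add` stacks a new binding on top of earlier ones for the same key, `Hashtbl.remove` removes the most recent one, and `Hashtbl.find_all` returns every binding, so the length of that list acts as the count. The names `table`, `acquire`, `release` and `refcount` are hypothetical.

```ocaml
(* Hypothetical names; the real library's table and record types differ. *)
let table : (int, string ref) Hashtbl.t = Hashtbl.create 16

(* Each acquire adds one more binding that points at the same record. *)
let acquire key record = Hashtbl.add table key record

(* Each release removes the most recently added binding for the key. *)
let release key = Hashtbl.remove table key

(* The number of bindings for the key is the reference count. *)
let refcount key = List.length (Hashtbl.find_all table key)

let () =
  let record = ref "backtrace" in
  acquire 1 record;
  acquire 1 record;
  assert (refcount 1 = 2);
  release 1;
  assert (refcount 1 = 1);
  release 1;
  assert (refcount 1 = 0)
```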
Use thread-local backtrace tables
We need a way to construct backtraces from data sent from other languages. Signed-off-by: David Scott <dave.scott@eu.citrix.com>
Add support for Interoperating with other languages
We take the hit of parsing the OCaml < 4.02 stacktraces safe in the knowledge that we can optimise this away (with optcomp) later. This also means we take full control over the rendering of stacktraces and can make the python and OCaml backtraces look the same. Signed-off-by: David Scott <dave.scott@eu.citrix.com>
Signed-off-by: David Scott <dave.scott@eu.citrix.com>
Store backtraces as lists of records rather than strings
All backtraces have been off by one...

0/9 xapi @ renoir Raised at file "db_rpc_client_v1.ml", line 33, characters 14-39
...
8/9 xapi @ renoir Called from file "lib/backtrace.ml", line 150, characters 17-21

Signed-off-by: Si Beaumont <simon.beaumont@citrix.com>
Correct backtrace ordering by indexing from 1
Signed-off-by: Si Beaumont <simon.beaumont@citrix.com>
Signed-off-by: Si Beaumont <simon.beaumont@citrix.com>
Signed-off-by: Si Beaumont <simon.beaumont@citrix.com>
Enable Travis using ocaml-travisci-skeleton
Signed-off-by: Si Beaumont <simon.beaumont@citrix.com>
Signed-off-by: Si Beaumont <simon.beaumont@citrix.com>
Signed-off-by: David Scott <dave@recoil.org>
Signed-off-by: David Scott <dave@recoil.org>
Release 0.3
Signed-off-by: Marcello Seri <marcello.seri@citrix.com>
Merge pp-ely into master for moving to OCaml 4.02.3 and PPX
Run Travis build on all branches, but only upload docs on the master branch. Signed-off-by: Gabor Igloi <gabor.igloi@citrix.com>
Fix travis build; update opam file
Signed-off-by: Gabor Igloi <gabor.igloi@citrix.com>
Also requires adjusting the test output, because line numbers have changed. Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Instead of external. Signed-off-by: Edwin Török <edwin.torok@citrix.com>
xe `--trace` has existed since ~2014, but it isn't documented in `--help` and is therefore not well known. It also only worked on a single host, which limited its usefulness in a pool.

However, propagating backtraces between XAPIs in a pool is doable by using the already existing `backtrace` field in the Task object. Having working cross-host backtraces appears to have been the original design goal in [doc/content/design/backtraces.md](https://github.com/xapi-project/xen-api/blob/master/doc/content/design/backtraces.md).

In theory this should also work cross-language, with python SM backends. However, some plumbing is missing there: it currently doesn't work with either SMAPIv1 or SMAPIv3. Fixing that should be the topic of another PR (by someone else).
Generally, an ejected host can be considered a freshly installed one. In practice, however, the update level (the hash) is useful for determining the update state of an ejected host. For example, when a host has been ejected from a pool, the retained applied-update hash helps determine whether the host can easily join the pool again.
To make changes to the backtrace library we need to first import it into XAPI. Used `git subtree` to import it with its full history.

Follow-up pull requests will update it to:
* use new functions from Printexc (introduced in 4.02 and 4.11) that avoid parsing the strings
* deduplicate entries
* print function names and characters, not just line numbers

Eventually it could also be updated to capture a backtrace automatically (e.g. a `Backtrace.try_with` function).

A chainbuild passed using this PR, together with another internal PR that drops references to the external `xapi-backtrace` module from other internal packages.
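As an illustration of reading backtrace frames without parsing strings, here is a small sketch that uses the standard `Printexc` slot API (`get_raw_backtrace`, `backtrace_slots`, `Slot.location`). It is not the planned follow-up change; `frames_of_failure` is a hypothetical helper, and the frames it returns depend on whether the program was built with debug information.

```ocaml
(* Hypothetical helper, not part of the backtrace library: run [f] and, if it
   raises, return the (filename, line) pairs of the recorded backtrace. *)
let frames_of_failure (f : unit -> unit) : (string * int) list =
  Printexc.record_backtrace true;
  try
    f ();
    []
  with _ -> (
    let raw = Printexc.get_raw_backtrace () in
    match Printexc.backtrace_slots raw with
    | None -> [] (* no debug information available *)
    | Some slots ->
        Array.to_list slots
        |> List.filter_map (fun slot ->
               match Printexc.Slot.location slot with
               | None -> None
               | Some loc ->
                   Some (loc.Printexc.filename, loc.Printexc.line_number)))

let () =
  frames_of_failure (fun () -> failwith "boom")
  |> List.iter (fun (file, line) -> Printf.printf "%s:%d\n" file line)
```

Each slot already carries structured location data, so no regular expressions over "Raised at file ..." strings are needed.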
Writing code that calls XAPI functions is quite tedious, because you have to repeat `~rpc ~session_id` every time. It saves quite a lot of typing to write in this style instead:
```
open Client.Client
...
let value = call t @@ VM.maximise_memory ~self ~approximate:false ~total in
call t @@ VM.set_memory ~self ~value
```
You still need to repeat `call t @@`, but it is at the beginning and doesn't hinder readability.

Add new types and `val call` to Client.Client. The type is called `client` instead of `t` because it isn't used uniformly by other functions in this module.

No functional change to the product.

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
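To show why this style type-checks, here is a self-contained sketch with a mock client. It assumes a `client` record that simply bundles `rpc` and `session_id`; the record fields, the mock `VM.set_memory` and the string-based `rpc` are illustrative assumptions, not the real `Client.Client` definitions.

```ocaml
(* Assumed shape: the real [client] type in Client.Client may differ. *)
type client = { rpc : string -> string; session_id : string }

(* [call] supplies the two labelled arguments that every API call needs. *)
let call (t : client)
    (f : rpc:(string -> string) -> session_id:string -> 'a) : 'a =
  f ~rpc:t.rpc ~session_id:t.session_id

(* Mock API module standing in for the generated XAPI client bindings. *)
module VM = struct
  let set_memory ~rpc ~session_id ~self ~value =
    rpc (Printf.sprintf "%s:VM.set_memory:%s:%d" session_id self value)
end

let () =
  let t = { rpc = (fun request -> "ok " ^ request); session_id = "s1" } in
  (* Partially applying the call-specific labels leaves exactly
     [rpc:_ -> session_id:_ -> _], which is what [call] expects. *)
  let reply = call t @@ VM.set_memory ~self:"vm1" ~value:1024 in
  assert (reply = "ok s1:VM.set_memory:vm1:1024")
```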
This uses a previously unused field in the log message format to log the Trace Context. This includes the Trace ID (common to the entire tree of activities) and the parent Span ID (unique to this instance of the remote caller). We don't log the local span/parent ID, since this will keep changing.

Logging the traceparent could make it easier to group log messages belonging to the same high-level activity. When an external Trace Context is not available (the default), the log messages are unchanged.

An alternative would be to explicitly pass a scope/context to the logging functions, but this would require some automated rewriting of the codebase to plumb through the required parameters. With the ambient context the change is much smaller, and we can still plumb through an explicit context later if needed.

To avoid a dependency cycle this is not using Threadext, but Ambient_context directly.

The first user of this will be the new quicktest.

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
This will build upon the upstream Opentelemetry library, so we can gradually move the existing Tracing library over. The upstream library supports Logs and Metrics too, not just Traces. For now this lives inside quicktest, eventually it should be moved into our tracing library. No functional change. Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Some quicktests may run for a long time, and we don't want to run out of memory if they keep creating events/logs/metrics on the same span. This uses a Queue internally, so that we can drop the oldest element when full. Could've used a ringbuffer, but that would've increased per-span memory usage a lot. No functional change. Signed-off-by: Edwin Török <edwin.torok@citrix.com>
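A minimal sketch of a bounded buffer built on the standard `Queue` module that drops the oldest element when full, as described above. The type `bounded` and the functions `create`, `push` and `to_list` are hypothetical names; the actual per-span storage in the quicktest code may be organised differently.

```ocaml
(* Hypothetical bounded buffer: at most [capacity] items are retained. *)
type 'a bounded = { capacity : int; items : 'a Queue.t }

let create capacity =
  if capacity <= 0 then invalid_arg "capacity must be positive";
  { capacity; items = Queue.create () }

(* When the buffer is full, discard the oldest item before adding. *)
let push t x =
  if Queue.length t.items >= t.capacity then ignore (Queue.pop t.items);
  Queue.push x t.items

(* Items in insertion order, oldest first. *)
let to_list t = List.rev (Queue.fold (fun acc x -> x :: acc) [] t.items)

let () =
  let events = create 3 in
  List.iter (push events) [ "e1"; "e2"; "e3"; "e4" ];
  assert (to_list events = [ "e2"; "e3"; "e4" ])
```

A `Queue` only allocates cells for items that are actually stored, whereas a preallocated ring buffer would reserve its full capacity for every span.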
The backend is very simple, and may block the caller if the background thread is slow due to I/O. This is not suitable for production use, just for testing (eventually we should use the atomic queue we have in Tracing_export *) No functional change.

Can be imported into a local Jaeger instance like this:
```
curl -v localhost:4318/v1/traces --data-binary @trace.trace.otel -H 'Content-Type: application/x-protobuf' -o x
```
Logs and Metrics are not supported by Jaeger though, so those would have to be imported into another tool.

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Until we can upgrade to a newer version of opentelemetry which includes it. No functional change. Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Extends upstream Opentelemetry with convenience functions to record logs and metrics associated with spans. Implements sampling decisions. No functional change. Signed-off-by: Edwin Török <edwin.torok@citrix.com>
This is a parent based sampler: if the parent is sampled, then so is the current span, otherwise it defaults to recording if a backend is registered. This will allow implementing a tail based span processor that changes the sampling decision when a span fails. For now we have only 1 hardcoded sampler, eventually we might make this configurable. No functional change. Signed-off-by: Edwin Török <edwin.torok@citrix.com>
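One possible reading of this rule, sketched as a pure function. It assumes the usual OpenTelemetry distinction between recording a span and marking it as sampled, so that an unsampled span can still be recorded and later promoted by a tail-based processor. The `decision` type and the `decide` function are hypothetical and are not the API added by this commit.

```ocaml
(* Hypothetical decision type mirroring the OpenTelemetry sampling results. *)
type decision = Drop | Record_only | Record_and_sample

(* Parent-based rule: inherit a sampled parent; otherwise record (without
   sampling) only when some backend is registered to receive the data. *)
let decide ~parent_sampled ~backend_registered =
  if parent_sampled then Record_and_sample
  else if backend_registered then Record_only
  else Drop

let () =
  assert (decide ~parent_sampled:true ~backend_registered:false = Record_and_sample);
  assert (decide ~parent_sampled:false ~backend_registered:true = Record_only);
  assert (decide ~parent_sampled:false ~backend_registered:false = Drop)
```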
This is a Tail-based Sampling Processor. See https://opentelemetry.io/docs/languages/dotnet/traces/tail-based-sampling/ https://opentelemetry.io/docs/concepts/sampling/#tail-sampling Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Wrapper around upstream Trace module using our Scope, and with support for [result]. Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Signed-off-by: Edwin Török <edwin.torok@citrix.com>
We may want to emit Opentelemetry items to multiple destinations (console, disk, etc.). Implement a Collector.BACKEND functor that forwards all calls to 2 other backends. No functional change. Signed-off-by: Edwin Török <edwin.torok@citrix.com>
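The forwarding idea can be sketched with a deliberately simplified signature. The `BACKEND` module type below has a single hypothetical `send` function and is an assumption for illustration; the upstream `Collector.BACKEND` signature has more entries, and the functor in this commit forwards each of them in the same way.

```ocaml
(* Simplified, hypothetical signature; not the upstream Collector.BACKEND. *)
module type BACKEND = sig
  val send : string -> unit
end

(* Forward every call to both backends, first A and then B. *)
module Tee (A : BACKEND) (B : BACKEND) : BACKEND = struct
  let send item =
    A.send item;
    B.send item
end

(* Two in-memory backends standing in for console and disk output. *)
let console_log : string list ref = ref []
let disk_log : string list ref = ref []

module Console = struct
  let send item = console_log := item :: !console_log
end

module Disk = struct
  let send item = disk_log := item :: !disk_log
end

module Both = Tee (Console) (Disk)

let () =
  Both.send "span-1";
  assert (!console_log = [ "span-1" ] && !disk_log = [ "span-1" ])
```

Because the result of `Tee` is itself a `BACKEND`, more than two destinations can be combined by nesting, for example `Tee (Both) (Other)`.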
Currently useful for debugging what the output looks like. Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Useful for quicktest_trace. Signed-off-by: Edwin Török <edwin.torok@citrix.com>
…ls to XAPI Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Signed-off-by: Edwin Török <edwin.torok@citrix.com>
…tions (#6858)

Test that we can fill a host with 1 VM and with N VMs, based on maximise_memory/compute_memory_overhead. Check that the constant factors used in XAPI are correct, e.g. the amount of memory used per vCPU.

Can be used to validate these PRs: #6855 #6854

There is also a pagetable overhead calculation, but something weird is going on there:
```
[2026-01-22T18:40:49.342348481-00:00|0000000000000000] pagetables,memory_overhead_pages,coeff,vms
[2026-01-22T18:40:49.342333285-00:00|0000000000000000] 64,793,12.3906,9223372036854775807
[2026-01-22T18:40:49.342335974-00:00|0000000000000000] 192,1305,6.79688,9223372036854775807
[2026-01-22T18:40:49.342337658-00:00|0000000000000000] 448,2329,5.19866,9223372036854775807
[2026-01-22T18:40:49.342339751-00:00|0000000000000000] 962,4377,4.5499,9223372036854775807
[2026-01-22T18:40:49.342341392-00:00|0000000000000000] 263102,1048827,3.98639,9223372036854775807
[2026-01-22T18:40:49.342343128-00:00|0000000000000000] 526273,2097403,3.98539,9223372036854775807
[2026-01-22T18:40:49.342345071-00:00|0000000000000000] 708913,2825211,3.98527,9223372036854775807
```
The coefficient should be ~4, and I don't know why it is as high as ~12.4 in the first row. It used to be reliably 4, so this could be a bug in the test. That will need further investigation. There is also enough free memory on the host that this underestimate doesn't actually cause a failure, which is unexpected too.
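Judging from the numbers in the log, the `coeff` column appears to be `memory_overhead_pages / pagetables`. The short sketch below recomputes it from the logged rows under that assumption; `rows` and `coeff` are hypothetical names, and this is not the test's own code.

```ocaml
(* (pagetables, memory_overhead_pages) pairs copied from the log above. *)
let rows =
  [ (64, 793); (192, 1305); (448, 2329); (962, 4377);
    (263102, 1048827); (526273, 2097403); (708913, 2825211) ]

(* Assumed definition of the coefficient: overhead pages per pagetable page. *)
let coeff (pagetables, overhead_pages) =
  float_of_int overhead_pages /. float_of_int pagetables

let () =
  List.iter
    (fun ((pagetables, overhead_pages) as row) ->
      Printf.printf "%d,%d,%g\n" pagetables overhead_pages (coeff row))
    rows
```

The first row gives 793 / 64 = 12.390625, while the large rows settle just under 4, so under this reading the ratio only looks wrong for small pagetable counts.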
changlei-li
approved these changes
Mar 20, 2026
BengangY
approved these changes
Mar 20, 2026
No description provided.